Current Issue : January-March Volume : 2023 Issue Number : 1 Articles : 5 Articles
Background: Several types of RNA in the cell are usually involved in biological processes with multiple functions. Coding RNAs code for proteins while non-coding RNAs regulate gene expression. Some single-strand RNAs can create a circular shape via the back splicing process and convert into a new type called circular RNA (circRNA). circRNAs are among the essential non-coding RNAs in the cell that involve multiple disorders. One of the critical functions of circRNAs is to regulate the expression of other genes through sponging micro RNAs (miRNAs) in diseases. This mechanism, known as the competing endogenous RNA (ceRNA) hypothesis, and additional information obtained from biological datasets can be used by computational approaches to predict novel associations between disease and circRNAs. Results: We applied multiple classifiers to validate the extracted features from the heterogeneous network and selected the most appropriate one based on some evaluation criteria. Then, the XGBoost is utilized in our pipeline to generate a novel approach, called CircWalk, to predict CircRNA-Disease associations. Our results demonstrate that CircWalk has reasonable accuracy and AUC compared with other state-ofthe- art algorithms. We also use CircWalk to predict novel circRNAs associated with lung, gastric, and colorectal cancers as a case study. The results show that our approach can accurately detect novel circRNAs related to these diseases. Conclusions: Considering the ceRNA hypothesis, we integrate multiple resources to construct a heterogeneous network from circRNAs, mRNAs, miRNAs, and diseases. Next, the DeepWalk algorithm is applied to the network to extract feature vectors for circRNAs and diseases. The extracted features are used to learn a classifier and generate a model to predict novel CircRNA-Disease associations. Our approach uses the concept of the ceRNA hypothesis and the miRNA sponge effect of circRNAs to predict their associations with diseases. Our results show that this outlook could help identify CircRNA-Disease associations more accurately....
Background: In siRNA based antiviral therapeutics, selection of potent siRNAs is an indispensable step, but these commonly used features are unable to construct the boundary between potent and ineffective siRNAs. Results: Here, we select potent siRNAs by removing ineffective ones, where these conditions for removals are constructed by C-features of siRNAs, C-features are generated by MG-algorithm, Icc-cluster and the different combinations of some commonly used features, MG-algorithm and Icc-cluster are two different algorithms to search the nearest siRNA neighbors. For the ineffective siRNAs in test data, they are removed from test data by I-iteration, where I-iteration continually updates training data by adding these successively removed siRNAs. Furthermore, the efficacy of siRNAs of test data is predicted by their nearest neighbors of training data. Conclusions: By siRNAs of Hencken dataset, results show that our algorithm removes almost ineffective siRNAs from test data, gives the clear boundary between potent and ineffective siRNAs, and accurately predicts the efficacy of siRNAs also. We suggest that our algorithm can provide new insights for selecting the potent siRNAs....
Background: Essential proteins are indispensable to the development and survival of cells. The identification of essential proteins not only is helpful for the understanding of the minimal requirements for cell survival, but also has practical significance in disease diagnosis, drug design and medical treatment. With the rapidly amassing of protein–protein interaction (PPI) data, computationally identifying essential proteins from protein–protein interaction networks (PINs) becomes more and more popular. Up to now, a number of various approaches for essential protein identification based on PINs have been developed. Results: In this paper, we propose a new and effective approach called iMEPP to identify essential proteins from PINs by fusing multiple types of biological data and applying the influence maximization mechanism to the PINs. Concretely, we first integrate PPI data, gene expression data and Gene Ontology to construct weighted PINs, to alleviate the impact of high false-positives in the raw PPI data. Then, we define the influence scores of nodes in PINs with both orthological data and PIN topological information. Finally, we develop an influence discount algorithm to identify essential proteins based on the influence maximization mechanism. Conclusions: We applied our method to identifying essential proteins from saccharomyces cerevisiae PIN. Experiments show that our iMEPP method outperforms the existing methods, which validates its effectiveness and advantage....
Background: A recurring problem in image segmentation is a lack of labelled data. This problem is especially acute in the segmentation of lung computed tomography (CT) of patients with Coronavirus Disease 2019 (COVID-19). The reason for this is simple: the disease has not been prevalent long enough to generate a great number of labels. Semi-supervised learning promises a way to learn from data that is unlabelled and has seen tremendous advancements in recent years. However, due to the complexity of its label space, those advancements cannot be applied to image segmentation. That being said, it is this same complexity that makes it extremely expensive to obtain pixellevel labels, making semi-supervised learning all the more appealing. This study seeks to bridge this gap by proposing a novel model that utilizes the image segmentation abilities of deep convolution networks and the semi-supervised learning abilities of generative models for chest CT images of patients with the COVID-19. Results: We propose a novel generative model called the shared variational autoencoder (SVAE). The SVAE utilizes a five-layer deep hierarchy of latent variables and deep convolutional mappings between them, resulting in a generative model that is well suited for lung CT images. Then, we add a novel component to the final layer of the SVAE which forces the model to reconstruct the input image using a segmentation that must match the ground truth segmentation whenever it is present. We name this final model StitchNet. Conclusion: We compare StitchNet to other image segmentation models on a high-quality dataset of CT images from COVID-19 patients. We show that our model has comparable performance to the other segmentation models. We also explore the potential limitations and advantages in our proposed algorithm and propose some potential future research directions for this challenging issue....
Background: Cells progressing from an early state to a developed state give rise to lineages in cell differentiation. Knowledge of these lineages is central to developmental biology. Each biological lineage corresponds to a trajectory in a dynamical system. Emerging single-cell technologies such as single-cell RNA sequencing can capture molecular abundance in diverse cell types in a developing tissue. Many computational methods have been developed to infer trajectories from single-cell data. However, to our knowledge, none of the existing methods address the problem of determining the existence of a trajectory in observed data before attempting trajectory inference. Results: We introduce a method to identify the existence of a trajectory using three graph-based statistics. A permutation test is utilized to calculate the empirical distribution of the test statistic under the null hypothesis that a trajectory does not exist. Finally, a p-value is calculated to quantify the statistical significance for the presence of trajectory in the data. Conclusions: Our work contributes new statistics to assess the level of uncertainty in trajectory inference to increase the understanding of biological system dynamics....
Loading....